A semi-automated methodology for finding lipid-related GO terms

Authors

  • Mengyuan Fan
  • Hong Sang Low
  • Markus R. Wenk
  • Limsoon Wong
Abstract

MOTIVATION Although semantic similarity in Gene Ontology (GO) and other approaches may be used to find similar GO terms, there is not yet a method to systematically find, with high accuracy (e.g., by involving human curation), a class of GO terms sharing a common property. RESULTS We have developed a methodology to address this issue and applied it to identify lipid-related GO terms, owing to the important and varied roles of lipids in many biological processes. Our methodology finds lipid-related GO terms in a semi-automated manner, requiring only moderate manual curation. We first obtain a list of lipid-related gold-standard GO terms by keyword search and manual curation. Then, based on the hypothesis that co-annotated GO terms share similar properties, we develop a machine learning method that expands the list of lipid-related terms from the gold standard. The terms predicted most likely to be lipid-related are examined by a human curator, who follows specific curation rules to confirm the class labels. The structure of GO is also exploited to help reduce the curation effort. The prediction and curation cycle is repeated until no further lipid-related term is found. Our approach covers a high proportion, if not all, of lipid-related terms with relatively high efficiency. DATABASE URL http://compbio.ddns.comp.nus.edu.sg/~lipidgo.
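The prediction and curation cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the machine learning method, so a simple co-annotation fraction is swapped in as the score. The function names, the `top_k` and `min_score` parameters, the toy term labels, and the `curate` callback standing in for the human curator are all assumptions made for illustration.

```python
from typing import Callable, Dict, Set


def coannotation_score(term: str, lipid_terms: Set[str],
                       annotations: Dict[str, Set[str]]) -> float:
    """Fraction of genes annotated with `term` that also carry a known lipid term."""
    genes = [g for g, terms in annotations.items() if term in terms]
    if not genes:
        return 0.0
    hits = sum(1 for g in genes if annotations[g] & lipid_terms)
    return hits / len(genes)


def expand_lipid_terms(seed: Set[str], annotations: Dict[str, Set[str]],
                       curate: Callable[[str], bool],
                       top_k: int = 2, min_score: float = 0.5) -> Set[str]:
    """Repeat predict-then-curate rounds until a round confirms no new term."""
    lipid = set(seed)
    rejected: Set[str] = set()
    all_terms = set().union(*annotations.values())
    while True:
        candidates = all_terms - lipid - rejected
        # Rank unlabelled terms by co-annotation with the current lipid set.
        scored = sorted(
            ((coannotation_score(t, lipid, annotations), t) for t in candidates),
            reverse=True,
        )
        # Only the highest-ranked candidates are shown to the curator.
        batch = [t for score, t in scored[:top_k] if score >= min_score]
        accepted = {t for t in batch if curate(t)}
        rejected.update(set(batch) - accepted)
        if not accepted:
            return lipid
        lipid |= accepted


# Toy gene -> GO term annotations (hypothetical labels, not real GO identifiers).
annotations = {
    "gene1": {"lipid_metabolism", "membrane_raft"},
    "gene2": {"lipid_metabolism", "membrane_raft", "vesicle_transport"},
    "gene3": {"vesicle_transport", "membrane_raft"},
    "gene4": {"dna_repair"},
    "gene5": {"dna_repair", "vesicle_transport"},
}
seed = {"lipid_metabolism"}  # stands in for the gold-standard list


def curate(term: str) -> bool:
    """Stand-in for the human curator's decision."""
    return term in {"membrane_raft", "vesicle_transport"}


result = expand_lipid_terms(seed, annotations, curate)
print(sorted(result))
```

In this toy run each accepted term raises the score of its co-annotated neighbours in the next round, and the loop stops once the curator rejects every term in a batch. The sketch does not model the use of the GO hierarchy to reduce curation effort.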

Similar resources

LipidGO: database for lipid-related GO terms and applications

MOTIVATION Lipids, an essential class of biomolecules, are receiving increasing attention in the research community, especially with the development of analytical techniques such as mass spectrometry. Gene Ontology (GO) is the de facto standard function annotation scheme for gene products. Identification of both explicit and implicit lipid-related GO terms will help lipid research in many ways, e.g. ...

Slim-o-matic: a Semi-Automated Way to Generate Gene Ontology Slims

The Gene Ontology (GO) currently contains over 40,000 terms describing the locations, activities and processes of gene products. Several millions of gene products have been annotated using the GO, and these annotations are routinely used for multiple applications. However, because of the difference of granularity in the annotations, it is useful to summarize GO annotations using GO slims. GO sli...

Cell, Chemical and Anatomical Views of the Gene Ontology: Mapping to a Roche Controlled Vocabulary

The Gene Ontology (GO) consists of around 40,000 terms referring to classes of biological process, cell component and gene product activity. It has been used to annotate the functions and locations of several million gene products. Much pharmacological research focuses on understanding how disease conditions differ from physiological conditions in molecular terms with the aim of finding new drug...

Cost Function Modelling for Semi-automated SC, RTG and Automated and Semi-automated RMG Container Yard Operating Systems

This study analyses the concept of cost functions for semi-automated Straddle Carrier (SC), Rubber Tyred Gantry (RTG) and automated Rail Mounted Gantry (RMG) container yard operating cranes. It develops a generic cost based model for a pair-wise comparison, analysis and evaluation of economic efficiency and effectiveness of container yard equipment to be used for decision-making by terminal pla...

Unsupervised Information Extraction for Finding Gene Functions

Finding gene functions discussed in the literature is imperative to information extraction from biomedical documents. Automated, computational methodologies can significantly reduce the need for manual curation and improve the quality of other related Information Extraction (IE) systems. We propose an open information extraction method for the BioCreative IV GO shared task (Subtask b), a workshop designed...

Journal title:

Volume 2014, issue

Pages -

Publication date: 2014